NIST 

PUBLICATIONS 


A111D3  3S77D7 


Applied  and  nistir  89-4197 

Computational 

Mathematics 

Division 


Center  for  Computing  and  Applied  Mathematics 


Orthogonal  Distance  Regression 

Paul  T.  Boggs  and  Janet  R.  Donaldson 


November,  1989 


U.S.  DEPARTMENT  OF  COMMERCE 
National  Institute  of  Standards  and  Technology 
' Gaithersburg,  MD  20899 

QC 

100 

.056 

89-4197 

1989 


...oiii’UTE  OF  STANDARDS  & 
TECHNOLOGY 
Research  Information  Center 
Gaithersburg,  MD  20899 


Orthogonal  Distance  Regression  * 


Qtion 

O 

iip.  - 4m 


Paul  T.  Boggs  ^ Janet  R.  Donaldson  ^ 


Abstract 

Orthogonal  Distance  Regresson  (ODR)  is  the  name  given  to  the  com- 
putational problem  associated  with  finding  the  mciximum  likelihood  esti- 
mators of  parameters  in  measurement  error  models  in  the  case  of  normally 
distributed  errors.  We  examine  the  stable  and  efficient  algorithm  of  Boggs, 

Byrd  and  Schnabel  {SIAM  J.  Sci.  Stat.  Comput.,  8 (1987),  pp.  1052- 
1078)  for  finding  the  solution  of  this  problem  when  the  underlying  model 
is  assumed  to  be  nonlinear  in  both  the  independent  variable  and  the  pa- 
rameters. We  also  describe  the  associated  public  domain  software  package, 
ODRPACK.  We  then  review  the  results  of  a simulation  study  that  compares 
ODR  with  ordinary  least  squares  (OLS).  We  also  present  the  new  results 
of  an  extension  to  this  study.  Finally  we  discuss  the  use  of  the  asymptotic 
covariance  matrix  for  computing  confidence  regions  and  intervals  for  the  es- 
timated parameters.  Our  conclusions  are  that  ODR  is  better  than  OLS  for 
the  criteria  considered,  and  that  ODRPACK  can  provide  effective  solutions 
and  useful  statistical  information  for  nonlinear  ODR  problems. 

1.  Introduction 

Much  has  been  written  about  measurement  error  models  and  their  properties. 
The  definitive  modern  treatment  of  such  problems,  which  are  also  known  as  er- 
rors in  variables  and  generalized  least  squares,  is  given  by  Fuller  [14].  Because 
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of  the  complexity  of  these  problems,  most  of  the  work  in  this  area  has  concerned 
linear  models.  In  this  paper,  however,  we  are  primarily  concerned  with  nonlinear 
measurement  error  models  and  the  stable  and  efficient  algorithm  of  Boggs,  Byrd 
and  Schnabel  [5]  for  the  numerical  solution  of  the  resulting  nonlinear  optimization 
problem  that  we  refer  to  as  the  orthogonal  distance  regression  problem.  We  also 
discuss  the  implementation  of  this  algorithm  in  the  software  library  ODRPACK 
and  the  use  of  this  software  to  perform  certain  simulation  studies  that  help  to 
illuminate  the  differences  between  using  an  orthogonal  distance  criterion  for  es- 
timating the  parameters  of  a model  and  the  use  of  the  standard  ordinary  least 
squares  (OLS)  criterion.  (See  Boggs  et  al.  [8],  and  Boggs,  Donaldson  and  Schn- 
abel [7].)  We  conclude  with  another  simulation  study  (Boggs  and  Donaldson  [6]) 
designed  to  test  the  effectiveness  of  the  confidence  intervals  and  regions  obtained 
from  the  asymptotic  covariance  matrix. 

The  measurement  error  model  can  be  defined  in  a number  of  ways.  Here  we 
assume  that 

( ^ l,...,n, 

are  observed  random  variables  with  underlying  true  values 

(^n  yi'jt  ^ — 1,  . . . , n, 

where  x,  E 3?”^  and  yi  G 3?.  We  also  assume  that  6,-  is  the  random  error  associated 
with  X,,  that  e,-  is  the  random  error  associated  with  y,-,  and  that 


Xi  = Xi  — Si 
Yi  — yi  — Cj. 


Finally,  we  assume  that  yi  is  given  as  a function  of  x,-  and  a set  of  parameters 
(3  G i.e., 

Vi  = 
or 

(Throughout  this  paper,  we  use  bold  face  to  denote  the  matrix  or  column  vec- 
tor whose  components  are  the  corresponding  subscripted  variables,  e.g.,  f3  = 
(/3i, /?2, . . . , /dp)^  where  ' means  transpose.)  The  function  / may  be  either  linear 
or  nonlinear  in  its  arguments,  but  must  be  a smooth  function  of  these  arguments. 
At  this  point,  we  make  no  restrictions  on  the  distribution  of  x,  i.e.,  we  do  not 
specify  either  a functional  or  structural  model. 
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Any  procedure  that  estimates  /3  for  a measurement  error  model  should  take 
into  account  that  both  Xi  and  Yi  have  error.  We  do  this  by  using  an  orthogonal 
distance  to  define  the  distance  from  the  point  {Xi,Yi)  to  the  curve  /(x;/3), 
where  " denotes  a non-distinguished  value  of  the  variable.  That  is,  we  take 


rf  = mm{ej  + 6f} 

it, Si 

subject  to:  Yi  = f{Xi  + — ei,  z = 1, . . . , n. 


Note  that  here,  and  in  the  following  derivation,  we  assume  that  Xi  is  one  dimen- 
sional to  simplify  the  notation. 

Under  the  assumptions  that  Cj-  ~ N{0,cr^J  and  6i  ~ A^(0,(7U,  it  follows  that 
the  maximum  likelihood  estimate  ^ is  that  which  minimizes  the  sum  of  the  squares 
of  the  r,-.  Thus  /3  is  the  solution  of 


n 


min  + 

p,o,e  i=i 

(1.1) 

subject  to:  Yi  = f{Xi  -1-  6,-;  3)  - e,, 

z = 1, . . 

. ,n. 

(1.2) 

Since  the  constraints  (1.2)  can  be  used  to  eliminate  e from  (1.1),  this  constrained 
problem  is  equivalent  to  the  unconstrained  minimization  problem 


min 


t=l  ^ ^ 


The  final  form  of  the  optimization  problem  is  obtained  by  allowing  weights 
to  be  associated  with  the  observations.  We  thus  define  the  weighted  Orthogonal 
Distance  Regression  Problem  (ODR)  to  be 


min 


f2<v^\[f(X,  + S,;h-y.f  +dfsh 

t=l  ^ 


(1.3) 


where  Wi  > 0 and  d,-  > 0.  The  problem  in  this  form  can  arise  in  statistical 
applications,  and  in  curve  fitting  and  other  applications  where  the  errors  and 
their  associated  weights  have  no  statistical  connotation.  In  the  remainder  of  this 
paper,  however,  we  deal  with  problems  where  Wi  = and  di  = cr^Jcrs,- 

In  §2  we  discuss  the  stable  and  efficient  numerical  algorithm  in  Boggs,  Byrd 
and  Schnabel  [5]  for  solving  this  problem.  This  algorithm  requires  the  same  work 
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per  iteration  as  the  corresponding  procedure  for  solving  an  OLS  problem.  The 
algorithm  has  been  implemented  in  a Fortran  subroutine  library  called  ODRPACK 
(Boggs  et  al.  [3])  that  is  publically  available.  The  features  of  this  package  are  also 
discussed. 

In  §3  we  review  the  results  of  a simulation  study  (Boggs  et  al.  [8])  that 
was  designed  to  examine  the  differences  between  ODR  and  OLS  solutions.  This 
“baseline”  study  assumes  that  the  ratios  d,-  = are  known  and  constant, 

and  Wi  = 1 for  all  i.  Under  the  chosen  criteria  ODR  is  shown  to  be  superior  to 
OLS.  We  then  describe  an  extended  study  (Boggs,  Donaldson  and  Schnabel  [7]) 
that  relaxes  the  assumption  that  the  d,-  are  known  exactly.  In  this  case,  if  the 
ratios  are  known  to  within  a factor  of  10,  ODR  is  still  preferred. 

Fuller  [14]  shows  how  to  derive  the  asymptotic  covariance  matrix  for  the  ODR 
problem.  In  §4  we  discuss  the  results  of  a further  simulation  study  (Boggs  and 
Donaldson  [6])  designed  to  ascertain  the  effectiveness  of  confidence  regions  and 
intervals  constructed  based  on  this  matrix. 

The  results  reported  here  allow  the  conclusion  that  ODR  is  preferred  over 
OLS  when  the  ratios  d,  are  reasonably  well  known.  Furthermore,  the  existence  of 
the  algorithm  and  software  described  in  §2  means  that  solving  an  ODR  problem 
is  just  as  easy  as  solving  a nonlinear  OLS  problem,  and  that  useful  statistical 
information  can  be  readily  produced. 


2.  Algorithm  and  Software 

The  ODR  problem  (1.3)  has  been  in  the  computing  literature  for  several  years, 
and  several  procedures  for  its  solution  have  been  proposed,  cf.,  Britt  and  Luecke 
[9],  Powell  and  MacDonald  [15],  Schnell  [18],  and  Schwetlick  and  Tiller  [19].  In 
this  section,  we  review  the  recent  stable  and  efficient  algorithm  of  Boggs,  Byrd 
and  Schnabel  [5],  and  its  implementation  in  ODRPACK.  (See  Boggs  et  al.  [3]  and 
[4].)  The  algorithm  is  based  on  a trust-region  strategy  that  yields  a Levenberg- 
Marquardt  type  step.  (See,  e.g.,  Dennis  and  Schnabel  [10].)  The  trust-region 
approach  allows  for  a much  more  effective  procedure  for  computing  the  so-called 
Levenberg-Marquardt  parameter. 

To  describe  the  algorithm  we  again  assume  that  .Y,  is  one  dimensional,  which 
keeps  the  notation  simpler.  (The  extension  to  m dimensions  is  straightforward, 
and  ODRPACK  allows  this.)  We  first  show  that  (1.3)  can  be  viewed  as  an  ex- 
tended OLS  problem.  We  then  review  the  trust-region  approach  for  solving  non- 
linear OLS  problems,  and  show  how  the  special  structure  of  the  extended  problem 
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can  be  exploited  to  produce  an  efficient  scheme.  In  fact  the  efficiency  is  compara- 
ble to  that  of  a trust-region  algorithm  for  solving  the  OLS  problem  obtained  from 
the  assumption  that  there  are  no  errors  in  We  conclude  this  section  with  a 
brief  description  of  ODRPACK. 

Let 


Wi 


9i+n{^i  — 


WidiSi 


i = 1 , . . . , n 
z = 


and  denote  the  2-norm  of  g by 

Then  (1.3)  becomes 


2n 


i=\ 


mm 


2n 


(3,6  ,=i 


which  is  an  OLS  problem  with  n -f  p parameters  and  2n  observations. 
Let 

I) 


d = 


and  denote  by  J G 3?2nx(n-f-p)  Jacobian  of 


gi^)  = • ,gi0)2n)\ 


''  06/ 

Then  the  linearized  model  at  the  current  iterate,  given  by  the  first  two  terms 
of  the  Taylor  series,  is 

g{0^)^J{0^)s=g^  + J‘^s  (2.1) 


where 

s = 0-0\ 


(2.1)  can  then  be  used  to  compute  a step,  5“^,  from  which  the  next  iterate,  0'^,  is 
found,  i.e.,  0^  = 0^^  s^.  The  obvious  step  that  could  be  calculated  from  (2.1) 
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is  the  so-called  Gauss-Newton  step  where  is  simply  the  (linear  least  squares) 
solution  of 

min  11^"  -h  J^sf  . 

if 

But  even  when  the  length  of  this  step  is  controlled,  e.g.,  by  a line  search,  the 
resulting  algorithm  has  not  proven  itself  to  be  competetive  with  other  procedures. 
(See,  e.g.,  Dennis  and  Schnabel  [10].)  If,  however,  we  explicitly  recognize  that  the 
length  of  the  step  must  be  controlled  by  the  extent  to  which  (2.1)  is  a good 
model  of  flr,  then  a much  better  procedure  arises. 

Specifically,  the  trust-region  algorithm  chooses  the  current  step  as  the  solu- 
tion to 

minslli^'^d- .2.2) 
subject  to:  ||s|l  < r ; 

where  r is  the  trust-region  radius.  The  value  of  r defines  the  region  where  (2.1)  is 
a good  model  of  g and  should  not  be  thought  of  in  a statistical  sense.  The  value 
of  r is  automatically  adjusted  at  each  iteration  based  on  whether  \[g{9^  + 5^)11  < 
llflr‘^11  and  on  a comparison  of  ||gr(0‘^ -|- s^)||  with  -f  The  actual  tests 

employed  in  modern  trust-region  algorithms  have  been  empirically  determined 
over  the  years.  (See  Dennis  and  Schnabel  [10]  for  a general  discussion,  and  Boggs, 
Byrd  and  Schnabel  [5]  and  Boggs  et  al.  [3]  for  specific  details.) 

An  outline  of  the  trust-region  algorithm  is  as  follows. 

Given  an  initial  approximation  0^  and  an  estimate  of  the  trust-region,  r: 


1.  Solve  (2.2)  for  5“^ 

2.  Set  0+  :=  + s" 

3.  Test:  If  g{9'^)  < then 

• set  0^^  :=  0^ 

• adjust  r if  necessary 

• check  convergence 

else 


• reduce  r 

4.  Go  to  Step  1 
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In  the  construction  of  portable  software  that  is  meant  to  be  used  for  a variety 
of  problems,  it  is  common  to  solve  (2.2)  by  first  doing  a QR  factorization  of  the 
Jacobian  matrix  J . (See,  e.g.,  Dennis  and  Schnabel  [10]  or  Stewart  [20].)  For  a 
dense  n x p matrix  the  QR  factorization  requires  0{np^)  operations.  Since  J is 
2n  X (n  + p),  it  therefore  requires  0{n{n  + p)^)  = 0{n^)  operations  to  compute 
its  QR  factorization.  The  matrix  J has  a special  structure,  however,  that  can  be 
exploited  to  solve  (2.2)  more  efficiently.  In  particular,  let  denote  the  first  n 
components  of  g and  g2  the  last  n components.  Then 


where 


and 


J = 


/ 22± 

d$  dS 

dfJo  dg, 

\ d$  d6  J 

/ % iT  \ 

dp 

\ 0 D ) 


rr  ..  f , 

H = diag  I Wi ^ = 1, . . . , n 


D = diag  {wid,,  i = 1, . . . ,n}  . 


(2.3) 


By  exploiting  the  structure  provided  by  the  existence  of  the  zero  block  and  the  two 
diagonal  blocks,  H and  T),  the  amount  of  work  per  iteration  required  to  solve  an 
ODR  problem  can  be  reduced  from  O(n^)  to  0{np^)  operations  plus,  on  average, 
less  than  two  evaluations  of  the  model  function  /.  Since  there  are  typically  many 
more  observations  than  parameters,  this  difference  can  have  a profound  influence 
on  the  size  of  problem  that  can  be  practicably  solved.  We  point  out  that  if  the 
errors  in  A",  are  ignored  and  the  resulting  OLS  problem  is  solved,  the  amount  of 
work  per  iteration  required  is  also  0{np^)  operations  plus  an  average  of  less  than 
two  evaluations  of  /.  Using  our  algorithm,  therefore,  an  ODR  problem  can  be 
solved  as  efficiently  as  an  OLS  problem.  See  Boggs  et  al.  [3]  for  the  details. 

We  conclude  this  section  with  a brief  description  of  the  Fortran  subroutine 
library,  ODRPACK,  that  we  have  written  to  implement  the  ODR  algorithm  de- 
scribed above.  A complete  description  is  contained  in  Boggs  et  al.  ([3]  and  [4]). 
The  package  is  portable  and  has  been  successfully  run  on  many  machines  from 
PCs  to  supercomputers. 
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ODRPACK  is  designed  to  solve  both  OLS  and  ODR  problems,  handling  many 
levels  of  user  sophistication  and  problem  difficulty. 

• It  is  easy  to  use,  providing  two  levels  of  user  control  of  the  computations, 
extensive  error  handling  facilities,  optional  printed  reports  summarizing  the 
iterations  and/or  the  final  results,  and  no  size  restrictions  other  than  effective 
machine  size. 

• The  necessary  Jacobian  matrices  are  approximated  numerically  if  they  are 
not  supplied  by  the  user.  The  correctness  of  user  supplied  derivatives  can 
also  be  verified  by  the  derivative  checking  procedure  provided. 

• Both  weighted  and  unweighted  analysis  can  be  performed.  Unequal  weights 
ty,,  i = 1, . . . , n,  are  allowed,  and  a different  value  of  i — 1, . . . , n and 
k — 1, . . . , m,  can  be  specified  for  each  component  of  X.  A feature  has  also 
been  incorporated  that  allows  the  same  weight  to  apply  to  many  components 
without  having  to  explicitly  enter  it  more  than  once. 

• Subsets  of  the  unknowns  can  be  treated  as  constants  with  their  values  held 
fixed  at  their  input  values,  allowing  the  user  to  examine  the  results  obtained 
by  estimating  subsets  of  the  unknowns  of  a general  model  without  rewriting 
the  code  that  computes  /. 

• The  covariance  matrix  and  the  standard  errors  for  the  model  parameter 
estimators  are  optionally  provided.  (See  §4.) 

• ODRPACK  automatically  compensates  for  problems  in  which  the  model 
parameters  and/or  unknown  errors  in  the  independent  variables  vary  widely 
in  magnitude. 

A copy  of  ODRPACK  may  be  obtained  free  of  charge  from  the  authors. 


3.  Simulation  Studies 

Researchers  including  Fuller  [14]  and  Reilman  tt  al.  [17]  provide  results  that  give 
theoretical  conditions  for  choosing  ODR  over  OLS  and  vice  versa.  These  results, 
however,  are  only  for  straight  line  models,  and,  to  our  knowledge,  there  are  no 
similar  results  for  either  general  linear  models  or  for  nonlinear  models.  This  state 
of  affairs  provides  the  motivation  for  a baseline  simulation  study  that  compares  the 
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performance  of  ODR  and  OLS  on  nonlinear  models.  These  results  are  contained 
in  Boggs  et  al.  [8]. 

The  central  assumption  in  Boggs  et  al.  [8]  is  that  the  ratios 

di  = (T,Jas, 

are  known  exactly.  We  first  review  this  study,  and  then  present  results  that  extend 
this  work  to  the  case  when  the  d,  are  not  known  exactly. 

The  existence  of  ODRPACK  as  described  in  §2  not  only  makes  these  studies 
possible,  but  also  makes  them  of  practical  interest.  Since  this  code  is  publically 
available,  other  researchers  and  practitioners  can  solve  their  own  ODR  problems 
and  conduct  similar  simulation  studies  on  other  models. 

The  study  in  Boggs  et  al.  [8]  is  the  first  to  consider  models  that  are  nonlinear 
in  both  (3  and  a:,  although  studies  involving  linear  and  quadratic  models  have 
been  reported  in  Amemiya  and  Fuller  [1],  Schnell  [18],  and  Wolter  and  Fuller  [21], 
for  example.  Our  procedure  is  to  compare  the  bias,  variance,  and  mean  square 
error  of  parameter  estimates  and  function  estimates  obtained  using  both  ODR 
and  OLS.  We  use  four  forms  of  the  model  function  / with  two  parameter  sets  per 
function.  We  generate  errors  for  each  using  d,  = d*  for  seven  values  of  d*  ranging 
from  0.1  to  oo,  replicating  each  problem  configuration  500  times.  Each  of  the 
resulting  91,000  problems  is  then  solved  using  ODR  and  OLS.  (Recall  that  the 
correct  values  of  d,  are  used  to  obtain  the  ODR  solutions.)  Both  the  ODR  and 
OLS  computations  are  done  by  ODRPACK. 

To  be  more  specific,  we  consider  a straight  line  model,  a quadratic  model,  an 
exponential  model,  and  a sine  model.  The  two  true  parameter  sets  for  each  of 
these  models  are  chosen  so  that  the  maximum  slope  of  / over  the  range  of  the  X 
data  is  1 and  10,  respectively.  The  x,-  are  taken  as  51  equally  spaced  points  on 
[-1,1];  thus  this  study  is  concerned  with  a functional  model.  The  errors  6 and  e 
are  generated  as  normally  distributed  and  adjusted  so  that  the  expected  sum  of 
the  squared  errors  is  constant  over  the  seven  values  of  d* . The  values  of  d*  that 
we  consider  are  0.1,  0.5,  1.0,  2.0,  10.0,  100.0,  and  oo,  where  d*  = oo  indicates  that 
there  are  no  errors  in  the  values  Xi. 

The  main  conclusion  of  this  study  is  that  ODR  is  better  than  OLS  for  all  of 
the  criteria  that  we  consider. 

• For  parameter  bias,  ODR  is  no  worse  than  OLS  in  98%  of  the  cases  studied 
and  is  a clear  winner  50%  of  the  time. 

• For  parameter  variance  and  mean  square  error,  ODR  is  no  worse  in  98%  of 
the  cases  and  is  appreciably  better  in  23%. 
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• Similar  results  are  obtained  for  the  function  estimates. 

• For  the  straight  line  model,  we  do  not  see  a preference  for  OLS  over  ODR,  as 
is  predicted  by  Reilman  et  al.  [17].  ODR  produces  a significantly  lower  bias 
than  OLS,  but  the  variance  and  mean  square  error  values  are  only  slightly 
smaller  for  OLS.  We  thus  conclude  that  ODR  is  still  preferable  in  this  case. 

• In  accordance  with  the  observations  of  Wolter  and  Fuller  [21]  and  Anderson 
[2]  our  computational  results  do  not  reflect  the  infinite  moments  of  ODR. 

In  practice,  the  ratios  di  are  seldom  known  exactly.  Thus  it  is  of  interest  to 
extend  the  above  results  to  the  case  where  an  approximate  value  of  each  d,  is  used 
in  the  ODR  procedure.  The  results  of  such  a study  are  presented  next. 

To  make  this  second  Monte  Carlo  study  tractable,  we  reduce  its  scope  some- 
what. We  consider  the  same  seven  functions  as  described  above,  but  with  only 
one  parameter  set  per  function.  For  this  study,  the  ratios  used  to  compute  the 
generated  errors  are  d*  equal  to  0.1,  1.0,  and  10.0,  and  the  values  of  the  error 
ratios  that  are  used  by  ODRPACK  in  estimating  the  parameters  are  d,  = jd*, 
where  7 has  the  values  0.1,  0.5,  1.0,  2.0  and  10.  That  is,  we  investigate  cases  where 
the  error  ratio  used  in  the  estimation  procedure  is  correct  to  within  a factor  of 
10.  We  again  compare  the  ODR  results  to  an  OLS  flt  using  the  same  criteria  as 
in  the  baseline  study  reported  above. 

We  give  a complete  description  of  the  study  and  the  results  in  Boggs,  Donald- 
son and  Schnabel  [7].  In  summary,  when  the  d,  are  correct  to  within  a factor  of 
10.0,  then: 

t The  bias  of  the  parameter  estimates  using  ODR  is  no  worse  than  the  bias 
using  OLS  in  88%  of  the  cases  studied,  and  is  significantly  smaller  59%  of 
the  time. 

• The  variance  of  the  parameter  estimates  using  ODR  is  no  worse  in  90%  of 
the  cases  and  is  smaller  in  40%. 

• The  mean  square  error  of  the  parameter  estimates  using  ODR  is  no  worse 
in  87%  of  the  cases  and  is  smaller  in  50%. 

Similar  results  are  observed  for  the  function  estimates.  In  addition,  there  appears 
to  be  some  support  for  overestimating  rather  than  underestimating  the  d,. 

We  conclude  from  these  two  studies  that  if  the  d,  are  known  to  within  a factor 
of  10,  then  ODR  is  preferable  to  OLS.  In  the  next  section,  however,  we  describe 
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how  estimated  confidence  regions  may  break  down  if  the  d,-  are  not  known  to 
within  a factor  of  2. 


4.  The  Asymptotic  Covariance  Matrix 

In  many  parameter  estimation  problems,  confidence  regions  and/or  confidence 
intervals  for  the  estimated  parameters  are  needed.  In  the  linear  OLS  case,  these 
can  be  obtained  exactly.  In  the  nonlinear  OLS  case  and  in  ODR,  the  computation 
of  exact  confidence  regions  and  intervals  is  computationally  intractable,  and  so 
approximate  methods  must  be  used. 

Fuller  [14]  derives  the  asymptotic  form  of  the  covariance  matrix  for  the  esti- 
mated parameters  of  measurement  error  models.  This  can  be  used  to  construct 
approximate  confidence  regions  and  intervals.  In  Boggs  and  Donaldson  [6]  we 
discuss  the  efficient  computation  of  this  covariance  matrix  in  the  context  of  the 
algorithm  and  software  of  §2.  Using  a Monte  Carlo  simulation  study,  we  then  as- 
sess the  quality  of  confidence  regions  and  intervals  computed  from  this  covariance 
matrix.  In  this  section  we  review  that  work. 

The  asymptotic  covariance  matrix  can  be  defined  using  the  extended  OLS 
problem  of  §2,  namely, 

2n 

(4.1) 

0 ^=l 


and  the  corresponding  Jacobian  matrix  J 6 with  (f,  j)  element  defined 

by 

j - ^ 

''  de, ' 

Let  0 be  the  estimate  of  0 from  (4.1)  and  let 


where 


3 _ gi0yOg{0) 
n p 


n = 


( 0 \ 

0 D'  j ’ 


w = 


diag  {wi, 


z = l,...,n} 
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and  D is  given  by  (2.3).  Also,  assume  that  O correctly  reflects  the  distribution 
of  the  errors  e and  6 as  specified  in  §1.  Then  the  covariance  matrix  is 

V = [J'OJ]~^ , 


which  Fuller  [14]  proves  is  asymptotically  correct  as  the  error  variance  tends  to 
zero. 

The  matrix  V can  be  used  to  construct  approximate  confidence  regions  and 
intervals  for  the  estimated  parameters  0.  We  call  this  method  the  linearization 
method  since  it  is  based  on  the  assumption  that  g is  adequately  approximated  by 
a linear  function  of  the  form  (2.1)  in  the  neighborhood  of  0,  The  goodness  of  the 
resulting  linearized  regions  and  linearized  intervals  depends  on  the  nonlinearity  of 
/ and  the  size  of  . Donaldson  and  Schnabel  [11]  show  that  the  linearized  con- 
fidence intervals  for  nonlinear  OLS  problems  appear  to  be  reasonable  in  practice, 
but  that  the  linearized  confidence  regions  can  be  inadequate.  (More  accurate,  but 
comensurately  more  expensive  approximations  can  be  obtained,  as  described  by, 
e.g.,  Donaldson  and  Schnabel  [11]  and  Efron  [13].) 

To  assess  linearized  confidence  regions  and  intervals  in  the  context  of  ODR 
problems,  we  conduct  a Monte  Carlo  simulation  similar  to  that  in  Donaldson 
and  Schnabel  [11].  Our  study  consists  of  four  example  problems.  Three  of  these 
problems  exist  in  the  literature.  They  are  Fuller  [14,  Example  3.2.2],  Ratkowsky 
[16,  Example  6.11],  and  Draper  and  Smith  [12,  Chapter  10,  Problem  E].  The 
fourth  example  is  taken  from  a consulting  session  involving  a “psychophysical” 
experiment  evaluating  the  ability  of  human  subjects  to  perceive  a visual  signal  as 
a function  of  the  intensity  of  the  signal. 

For  each  example  we  generate  500  sets  of  “observed”  data  with  normal  random 
error  and  true  error  ratios  d*.  We  then  solve  each  of  the  resulting  problems  using 
ODRPACK.  As  in  the  second  study  reported  in  §3,  this  study  also  solves  each 
problem  using  a range  of  assumed  error  ratios  d,  = 7^*,  where  7 = 0.1,  0.5,  1.0, 
2.0,  and  10.0.  For  each  of  the  500  data  sets,  we  also  construct  the  95%  confidence 
regions  and  intervals  for  the  estimate  0 using  the  linearization  method.  Finally, 
we  calculate  the  observed  coverage,  which  is  the  percentage  of  cases  in  which  the 
true  value  is  actually  contained  in  the  computed  region  or  interval.  Our  results, 
presented  in  Boggs  and  Donaldson  [6],  are  summarized  as  follows: 

• When  the  di  are  correct,  we  obtain  surprisingly  good  estimates  of  the  con- 
fidence regions  and  intervals  as  compared  to  the  results  of  Donaldson  and 
Schnabel  [11],  who  often  found  the  observed  coverage  for  a nominal  95% 
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confidence  region  to  be  less  than  80%.  We  assume  that  this  is  due  to  our 
choice  of  examples  and  is  not  a general  property  of  ODR. 

• The  confidence  region  estimates  significantly  degrade,  even  when  the  esti- 
mated values  of  d,-  are  off  by  no  more  than  a factor  of  2. 

• The  confidence  intervals  when  the  d,  are  correct  are  also  quite  good  and 
remain  good  when  the  d,  are  only  off  by  a factor  of  2. 

• The  confidence  intervals  significantly  degrade  when  the  d,  are  off  by  a factor 
of  10. 

• Confidence  regions  and  intervals  are  better  for  ^ than  for  6. 

• In  the  case  of  confidence  intervals  for  (3,  overestimation  of  the  d,  is  preferable 
to  underestimation.  For  6 the  reverse  is  true.  This  occurs  because,  as  the  d, 
are  increased,  6 becomes  more  restricted,  whereas  as  the  d,  are  decreased, 
an  artifically  small  value  of  a is  produced. 

Note  that  V is  the  covariance  matrix  for  the  entire  parameter  vector  0.  In 
practice,  however,  one  is  typically  only  interested  in  the  covariance  matrix  for  the 
model  parameters  $.  This  is  the  upper  left  p x p part  of  V , which  we  denote 
V p.  Just  as  the  special  structure  of  J allows  us  to  create  a fast  algorithm  for 
solving  (2.2),  that  same  structure  permits  us  to  equip  ODRPACK  with  an  efficient 
and  stable  means  of  calculating  V p.  The  formulas  are  contained  in  Boggs  and 
Donaldson  [6]  and  are  based  on  the  work  in  Boggs,  Byrd  and  Schnabel  [5]. 

The  overall  conclusion  from  this  third  study  is  that  the  linearized  confidence 
regions  and  especially  confidence  intervals  have  some  validity  for  ODR  problems 
when  the  d,-  are  known  to  within  a factor  of  2.  Furthermore,  the  computations 
needed  to  obtain  these  linearized  regions  and  intervals  for  /3  are  cheap,  and  are 
automatically  produced  by  ODRPACK.  Thus  we  advocate  the  use  of  linearized 
confidence  regions  and  intervals  for  measurement  error  models  in  the  same  spirit, 
and  with  the  same  caveats,  that  accompany  their  use  in  nonlinear  OLS  problems. 
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